Add asv benchmark jobs to CI #5796

Merged: 123 commits into pydata:main, Oct 24, 2021

Conversation

@Illviljan (Contributor) commented Sep 14, 2021

Workflow based on the version from scikit-image. Modified so that asv.conf.json lives inside a subdirectory, and the workflow triggers on every push if the PR has the run-benchmark label.

Notes:

References:

Tests checked:

  • interp
  • pandas
  • repr
  • combine
  • dataarray_missing
  • dataset_io - Skipped; too difficult to understand. Some of the tests are possibly broken.
  • indexing
  • reindexing
  • rolling - Some division-by-zero warnings are still printed.
  • unstacking

TODO:

  • self.setup_cache (see the sketch below)
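For context on that item: asv's setup_cache runs once per benchmark environment rather than once per timing repeat, and its return value is handed to setup() and the benchmark methods. A minimal sketch of how it could be used (class name, array shape, and benchmark are illustrative, not taken from this PR):

    import numpy as np
    import xarray as xr


    class ExampleBenchmark:
        def setup_cache(self):
            # asv runs this once per environment (not once per repeat); the
            # return value is passed as the first argument to setup() and to
            # every benchmark method in the class.
            return xr.DataArray(
                np.random.randn(365, 75, 75), dims=("time", "y", "x")
            )

        def setup(self, da):
            self.da = da

        def time_mean(self, da):
            da.mean("time")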

@Illviljan marked this pull request as draft September 14, 2021 22:02
github-actions bot commented Sep 14, 2021

Unit Test Results

6 files, 6 suites, 1h 0m 16s ⏱️
16 226 tests: 14 490 ✔️ passed, 1 736 💤 skipped, 0 failed
90 552 runs: 82 372 ✔️ passed, 8 180 💤 skipped, 0 failed

Results for commit 70cd679.

♻️ This comment has been updated with latest results.

@max-sixty (Collaborator) commented:

Awesome!! Thanks a lot, @Illviljan!

On how we run these — I would be fine with the labeled approach that scikit-learn uses.

What else do you think is needed for this? It already looks excellent, thank you.

@Illviljan (Contributor, Author) commented Sep 15, 2021

Here are some remaining items besides the missing green ticks, @max-sixty:

  • Long run times, 240+ minutes. scikit-image had it down to about 13 minutes somehow. What's the bottleneck? Which tests are extremely slow?
  • A lot of printed errors make the report very messy; go through each test and fix those. Not necessary for this PR, though.
  • asv.conf.json has been moved. If you have any idea how to run this job from inside asv_bench, that would be nice.
  • Should we align the benchmark folder names with other projects, e.g. NumPy?
  • Add more required installs? For example, bottleneck crashes tests because it isn't installed. Should we add all the non-required dependencies as well?
  • Can a normal user add and remove labels? I set this up so that I don't have to deal with the nightmare of setting up asv locally, and it would be nice to avoid having to ask you over and over again to add/remove labels just to trigger it.

@dcherian (Contributor) commented:

Long times, +240 minutes

Yeah, this is a problem.

Maybe we need to go through and mark the slow tests like in the README_ci.md document:

In that vein, a new private function is defined at benchmarks.__init__: _skip_slow. This will check if the ASV_SKIP_SLOW environment variable has been defined. If set to 1, it will raise NotImplementedError and skip the test. To implement this behavior in other tests, you can add the following attribute:
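A rough sketch of what that helper and the opt-in hook could look like, assuming it only inspects the environment variable (this is paraphrased for illustration, not the exact scikit-image code, and the attribute referenced in README_ci.md is not reproduced here):

    import os


    def _skip_slow():
        """Raise NotImplementedError so asv records the benchmark as skipped
        whenever ASV_SKIP_SLOW=1 is set in the environment."""
        if os.environ.get("ASV_SKIP_SLOW", "0") == "1":
            raise NotImplementedError("Skipping slow benchmark")


    class SlowSuite:
        def setup(self):
            # Calling the helper here skips every benchmark in the class
            # when ASV_SKIP_SLOW=1.
            _skip_slow()

        def time_something_slow(self):
            ...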

Add more required installs? For example, bottleneck crashes tests because it isn't installed. Should we add all the non-required dependencies as well?

Do you mean skipping those that require bottleneck rather than relying on asv to handle the crash? If so, sounds good.
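A minimal sketch of such a guard (the helper and class names are made up for illustration): asv treats NotImplementedError raised during setup() as "skip this benchmark" rather than as a failure.

    def _require_bottleneck():
        # Raising NotImplementedError in setup() makes asv skip the benchmark
        # instead of reporting a crash when bottleneck is missing.
        try:
            import bottleneck  # noqa: F401
        except ImportError:
            raise NotImplementedError("bottleneck is not installed")


    class RollingWithBottleneck:
        def setup(self):
            _require_bottleneck()

        def time_rolling_mean(self):
            ...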

Can a normal user add and remove labels?

An alternative approach would be to use something similar to our [skip-ci] and [test-upstream] tags in the commit message. Though I think that tag needs to be on every commit you want benchmarked.

@dcherian (Contributor) commented:

Ah one major problem IIUC is that we run a lot of benchmarks with and without bottleneck even if bottleneck isn't involved in the operation being benchmarked.


Some of the following should be fixed:


ImportError: Pandas requires version '0.12.3' or newer of 'xarray' (version '0.0.0' currently installed).


rolling.Rolling.time_rolling_construct
xarray.core.merge.MergeError: conflicting values for variable 'x_coords' on objects to be combined. You can skip this check by specifying compat='override'.


IOWriteNetCDFDaskDistributed.time_write
Looks like we're spinning up multiple dask clusters.
UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 33423 instead
warnings.warn(

@Illviljan (Contributor, Author) commented Sep 15, 2021

Do you mean skipping those that require bottleneck rather than relying on asv to handle the crash? If so, sounds good

Hmm, I was just thinking of installing all possible dependencies. But I think I've simply misunderstood the tests and why they were crashing.

The 'Benchmark no label triggered' job seems to have succeeded but still failed? Does anyone understand what the error means?

 [100.00%] ··· unstacking.UnstackingDask.time_unstack_slow             32.7±0.3ms

BENCHMARKS NOT SIGNIFICANTLY CHANGED.
+ grep 'Traceback \|failed\|PERFORMANCE DECREASED' benchmarks.log
+ exit 1
++ '[' 1 = 1 ']'
++ '[' -x /usr/bin/clear_console ']'
++ /usr/bin/clear_console -q
Error: Process completed with exit code 1.

Edit: I think I understand now. The log contains a bunch of Traceback errors, so the grep step fails even though the performance didn't change.

@Illviljan (Contributor, Author) commented:

@dcherian did you have something specific in mind when you linked to https://github.com/jaimergp/scikit-image/blob/main/.github/workflows/benchmarks-cron.yml in #4648? Otherwise I think I'll remove it and just focus on:

  • Benchmark label triggered - from scikit-image.
  • Benchmark no label triggered - from scikit-image, modified by me.

@Illviljan (Contributor, Author) commented Sep 15, 2021

There we go. I think the workflow works as intended now; we'll see in 3-4 hours.
The only thing left is to improve the tests, which can be done in other PRs, and maybe to rewrite that README slightly.

@Illviljan (Contributor, Author) commented Oct 7, 2021

Here's how long dataarray_missing.py takes in this workflow with different shapes:
321a761 - shape=(100, 25, 25), 3 minutes
8f262f9 - shape=(365, 50, 50), 3m 28s
0b7b1a0 - shape=(365, 75, 75), 4m 8s
8f08506 - shape=(365, 100, 100), 5m 47s
d1b908a - shape=(365, 200, 400), 12m 38s
56556f1 - shape=(3650, 100, 100), 19m 55s and crashes
1eba65c - shape=(3650, 200, 400), 20 minutes and crashes

Changed the shape to (365, 75, 75) as that seems to be around the tipping point where it starts slowing down.
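For reference, a rough sketch of how a shape like (365, 75, 75) typically feeds into a dataarray_missing-style benchmark (this is an approximation for illustration, not the exact code in asv_bench/benchmarks/dataarray_missing.py):

    import numpy as np
    import pandas as pd
    import xarray as xr


    def make_bench_data(shape=(365, 75, 75), frac_nan=0.1, seed=0):
        rng = np.random.default_rng(seed)
        data = rng.standard_normal(shape)
        # Punch random holes so interpolate_na/ffill have real work to do.
        data[rng.random(shape) < frac_nan] = np.nan
        time = pd.date_range("2000-01-01", periods=shape[0], freq="D")
        return xr.DataArray(data, dims=("time", "y", "x"), coords={"time": time})


    class DataArrayMissingInterpolateNA:
        def setup(self):
            self.da = make_bench_data()

        def time_interpolate_na(self):
            # Run time scales roughly with the number of elements, which is
            # why the workflow time grows with the shapes listed above.
            self.da.interpolate_na(dim="time")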

@dcherian (Contributor) commented:

Thanks @Illviljan, this is great work!

@dcherian merged commit 26e2e61 into pydata:main Oct 24, 2021
dcherian added a commit to dcherian/xarray that referenced this pull request Oct 29, 2021
* upstream/main:
  Only run asv benchmark when labeled (pydata#5893)
  Add asv benchmark jobs to CI (pydata#5796)
  Remove use of deprecated `kind` argument in `CFTimeIndex` tests (pydata#5723)
  Single matplotlib import (pydata#5794)
  Check jupyter nbs with black in pre-commit (pydata#5891)
snowman2 pushed a commit to snowman2/xarray that referenced this pull request Feb 9, 2022
@Illviljan deleted the asv-benchmark-cron branch August 12, 2022 09:01
Labels
run-benchmark Run the ASV benchmark workflow